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Part I 

Preface 



The solution of the time independent Schrodinger equation for molecular sys- 
tems requires the use of modern computers, because analytic solutions are not 
available. 
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This review deals with some of the methods known under the umbrella term 
quantum Monte Carlo (QMC), specifically those that have been most commonly 
used for electronic structure. Other appHcations of QMC are widespread to 
rotational and vibrational states of molecules, such as the work of ||^, ||, ||, ^, 
condensed matter physics |||, and nuclear physics ||^, |^. 

QMC methods have several advantages: 



Computer time scales with system size roughly as , where N is the num- 
ber of particles of the system. Recent developments have made possible 
the approach to linear scaling in certain cases. 

Computer memory requirements are small and grow modestly with system 
size. 

QMC computer codes are significantly smaller and more easily adapted to 
parallel computers than basis set molecular quantum mechanics codes. 

• Basis set truncation errors are absent in the QMC formaHsm. 

• Monte Carlo numerical efficiency can be arbitrarily increased. QMC cal- 
culations have an accuracy dependence of VT, where T is the computer 
time. This enables one to choose an accuracy range and readily estimate 
the computer time needed for performing a calculation of an observable 
with an acceptable error bar. 

The purpose of the present work is to present a description of the commonly 
used algorithms of QMC for electronic structure and to report some recent 
developments in the field. 

The paper is organized as follows. In Sec. ||, we provide a short introduction 
to the topic, as well as enumerate some properties of wave functions that are 
useful for QMC applications. In Sec. ffl^ we describe commonly used QMC 



algorithms.. In Sec. |V|, we briefiy introduce some special topics that remain 
fertile research areas. 

Other sources that complement and enrich the topics presented in this chap- 
ter are our previous monograph, ||^ and the reviews of jl^, 0, |l3[ 
|l6| , [l^ , ^. There are also chapters on QMC contained in selected computational 
physics texts ^n, ll^, M . Selected appHcations of the method are contained in 
Refs. in H g pf |6l 1^ H |§, 1^, H. 

QMC methods that are not covered in this review are the auxiliary field 
QMC method IH, H, H, H] and path integral methods fs^, 1^. 

Atomic units are used throughout, the charge of the electron, e and Planck's 
normahzed constant, h are set to unity. In this metric system, the unit distance 
is the Bohr radius, an. 
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Part II 

Introduction 



The goal of the quantum Monte Carlo (QMC) method is to solve the Schrodinger 
equation, which in the time independent representation is given by, 

H^nCR) = £^„*„(R) (1) 

Here, H is the Hamiltonian operator of the system in state n, with wave 
function ^'„(R) and energy R is a vector that denotes the 3A^ coordinates 
of the system of N particles (electrons and nuclei), R = {ri,...,r„}. For 
molecular systems, in the absence of electric or magnetic fields, the Hamiltonian 
has the form H = T + V , where T is the kinetic energy operator, T = — = 
— ^iid V is the potential energy operator. For atomic and molecular 

systems V is the Coloumb potential between particles of charge qi,V = |^ . 

The first suggestion of a Monte Carlo solution of the Schrodinger equation 
dates back to Enrico Fermi, based on Metropolis, and Ulam |Q. He indicated 
that a solution to the stationary state equation, 

- iv2,*(R) = E^iK) - V{K)^{R) (2) 

could be obtained by introducing a wave function of the form, \I>(R, r) = 
4'(R)e~^^. This yields the equation, 

^ffii^==iv2vl/(R,r)-l^(R)v|/(R,r) (3) 

Taking the limit r ^ oo, Eq. || is recovered. If the second term on the right 
hand side of Eq. § is ignored, the equation is isomorphic with a diffusion equa- 
tion, which can be simulated by a random walk |^, where random walkers 
diffuse in a R-dimensional space. If the first term is ignored, the equation is 
a first-order kinetics equation with a position dependent rate constant, T^(R), 
which can also be interpreted as a stochastic survival probability. A numeri- 
cal simulation in which random walkers diffuse through R-space, reproduce in 
regions of of low potential, and die in regions of high potential leads to a sta- 
tionary distribution proportional to ^(R), from which expectation values can 
be obtained. 



1 Numerical solution of the Schrodinger equation 

Most efforts to solve the Schrodinger equation are wave function methods. These 
approaches rely exclusively on linear combinations of Slater determinants, and 
include configuration interaction (CI) and the multi-configuration self-consistent 
field (MCSCF). There are perturbation approaches including the Moller-Plesset 
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series (MP2, MP4), and coupled cluster (CC) theory, which are presently pop- 
ular computational procedures. Wave function methods suffer from scaling de- 
ficiencies. An exact calculation with a given basis set expansion requires A^! 
computer operations, where N is the the number of basis functions. Com- 
petitive methods such as coupled cluster with singles and doubles, and triples 
perturbation treatment, CCSD(T), scale as iV^ 

A term that we will use later is correlation energy (CE). It is defined as the 
difference between the exact non-relativistic energy, and the energy of a mean 
field solution of the Schrodinger equation, the Hartree-Fock method, in the limit 



of an infinite basis set ||44[ ^ 



Ecorr ~ Eexact ~ EhF , (4) 

The CI, MCSCF, MP(N), and CC methods are all directed at generating ener- 
gies that approach Eexact- 

Other methods that have been developed include dimensional expansions 
and the contracted Schrodinger equation method |^^. For an overview of 
quantum chemistry methods, see Ref. 

Since the pioneering work of the late forties to early sixties fs^ , po] , |50| , 
the MC and related methods have grown in interest. QMC methods have an 
advantage with system size scaHng, in the simpHcity of algorithms and in trial 
wave function forms that can be used. 



2 Properties of the exact wave function 

The exact time independent wave function solves the eigenvalue equation |l|. 
Some analytic properties of this function are very helpful in the construction of 
trial functions for QMC methods. 

For the present discussion, we are interested in the discrete spectrum of the 
H operator. In most appHcations the total Schrodinger equation |l| can be rep- 
resented into an "electronic" Schrodinger equation and a "nuclear" Schrodinger 
equation based on the large mass difference between electrons and nuclei; This 
is the Born-Oppenheimer (BO) approximation. Such a representation need not 
be introduced in QMC but here is the practical benefit of it that the nuclei can 
be held fixed for electronic motion results in the simplest form of the electronic 
Schrodinger equation. 

The wave function also must satisfy the virial, hypervirial Hellman-Feynman 
and generalized Hellman-Feynman theorems |5^ , [s^ ^ . The local energy 



^For a more detailed analysis of the scaling of wave function based methods see, for example, 
and |Ui For a general overview of these methods, the reader is referred to Chapter I of 
fs book p3fl. 

^The Hellman-Feynman theorem is discussed in Sec. S.2.3. 
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is a constant for the exact wave function. 

When charged particles meet, there is a singularity in the Coulomb potential. 
This singularity must be compensated by a singularity in the kinetic energy, 
which results in a discontinuity in the first derivative, i. e., a cusp, in the wave 
function when two or more particles meet 1 55[ 56| . For one electron coalescing at 



a nucleus, if we focus in a one electron function or orbital (j){r) = x{r)Y/^{9, </>), 
where x(^) is a radial function, and Y[^{6,4>) is a spherical harmonic with 
angular and magnetic quantum numbers I and to, the electron-nucleus cusp 
condition is 

1 dq{r) _ Z 

where ri{r) is the radial wave function with the leading r dependence factored 
out, rj{r) = x(^)/'''")) and Z is the atomic number of the nucleus. 

For electron-electron interactions, the cusp condition takes the form, 

1 drjij (r) I 1 



?7,j(r) dr 2(Z + 1)' 

where rjij (r) is the r™ factored function for the electron-electron radial dis- 
tribution function. 

Furthermore, p(r), the spherical average of the electron density, p(r) ^, must 
satisfy another cusp condition, namely, 

j-p{r)\r^^ = -2Zp{r) (8) 
or 

at any nucleus. Another condition on p(r) is that asymptotically, it decays 
exponentially, 

p(r ^ oo) « (9) 

where lo is the first ionization potential. This relation can be derived from 
consideration of a single electron at large distance. Details on these requirements 
can be found in refs. jH^ and 

We discuss how to impose properties of the exact wave function on QMC 
trial functions in Sec. [H 

2.0.1 Approximate wave functions 

James and Coolidge [|9| proposed three accuracy tests of a trial wave function, 
^T'- the root mean square error in '^t, 

5^ = [j i^T --^ofdR]^ (10) 



■'if A'' is the number of electrons, then p(r) is defined by 

p(r) = N I l'^(R)l2dR 
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the energy error, 



Se = Ei^r) - Eo (11) 
and the root mean square local energy deviation, 

5E, = [j \{H-E,m^<m\^ (12) 
where the local energy is defined as in Eq. ||. 

The calculation of (5* by QMC requir es s ampling the exact wave function, a 
procedure that will be described in Sec. |8.l| . 

Several stochastic optimization schemes have been proposed for minimizing 
expressions [lo|-|l2|. Most researchers have focused on |l^, i.e., minimizing 5el\ 
see, for example, McDowell |Q. In Sec. || we turn to stochastic wave function 
optimization procedures. 



Part III 

Algorithms 

In this section, we describe the computational procedures of QMC methods. 
All of these methods use MC techniques widely used in other fields, such as 
operations research, applied statistics, and classical statistical mechanics simu- 
lations. Techniques such as importance sampling, correlated sampling and MC 
optimization are similar in spirit to those described in MC treatises |^, 
|6^ , |60| , |65| , |66| |6^ , |6^ , |69| , |70|| . The reader is referred to the former for more 
details on the techniques described in this Section. 

We present the simple, yet powerful variational Monte Carlo (VMC) method, 
in which the Metropolis MC ^ method is used to sample a known trial function, 
^T- We follow with the projector Monte Carlo (PMC) methods that sample 
the unknown ground state wave function. 



3 Variational Monte Carlo 
3.1 Formalism 

Variational methods involve the calculation of the expectation value of the 
Hamiltonian operator using a trial wave function, '^t- This function is de- 
pendent on a set of parameters. A, that are varied to minimize the expectation 
value, i.e., 

■^This algorithm is also known as the M(RT)^, due to the full list of the authors that 
contriliuted to it's development, Metropolis, Rosenbluth, Rosenbluth, Teller and Teller, see 
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The expectation value |l^ can be sampled from a probability distribution 
proportional to and evaluated from the expression, 

^ ^ > (14) 



/dR 


"H*t(R.) 


vf^R) 


J 


rdR*2,(R) 



/dR*2,(R) 



where is the local energy defined in Sec. 2.0.1. The procedure involves 
sampling random points in R-space from, 

The advantage of using |l^ as the probability density function is that one 
need not perform the averaging of the numerator and denominator of Eq. 
The calculation of the ratio of two integrals with the MC method is biased by 
definition: the average of a quotient is not equal to the quotient of the averages, 
so this choice of -P(R) avoids this problem. 

In general, sampling is done using the Metropolis method j?^, that is well 
described in Chapter 3 of [Q, and briefiy summarized here in Sec. |3.1.l| . 

Expectation values can be obtained using the VMC method from the follow- 
ing general expressions fz^ , 

~ _ /dRv&T(R)^d(R) ^ 1 VPfRi rifil 

= /dRvI/HR)2 =N^ (16) 



/dR 


■6<j*t(R.)' 

*r(R) 






f dR*T(R)2 



N 



Equation is for a coordinate operator, O, andjlj is preferred for a differential 
operator, Od- 

3.1.1 The generalized Metropolis algorithm 

The main idea of the Metropolis algorithm is to sample the electronic density, 
given hereby, ^'^.(R) using fictitious kinetics that in the limit of large simulation 
time yields the density at equilibrium. A coordinate move is proposed, R ^ R', 
which has the probability of being accepted given by, 

P(R^R') = mm(^l,J^^--^i^j, (18) 

where r(R R') denotes the transition probability for a coordinate move from 
R to R'. In the original MetropoHs procedure, T was taken to be a uniform 
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random distribution over a coordinate interval AR. Condition |l^ is necessary 
to satisfy the detailed balance condition, 

r(R' ^ R)*|(R') = T(R ^ RO^tIR) (19) 

which is necessary for ^'|.(R) to be the equilibrium distribution of the sam- 
pling process. 

Several improvements to the Metropolis method have been pursued both in 
classical and in QMC simulations. These improvements involve new transition 
probability functions and other sampling procedures. See, for example, ifzst [z^ , 

A common approach for improving T(R R') in VMC, is to use the quan- 
tum force, 

Fq = VIn|^'T(R)'| (20) 

as a component of the transition probability. The quantum force can be 
incorporated by expanding /(R,t) = |*t(R)^| = e" l*?'^^)! , in a Taylor 
series in ln|^'y(R)|, and truncating at first order, 

T(R -> R') « le^F<.W-(i^'-i^), (21) 

where iV is a normalization factor, and A is a parameter fixed for the simu- 
lation or optimized in some fashion, for example, see ||7^. A usual improvement 
is to introduce a cutoff in AR ~ (R' — R), so that if the proposed displacement 
is larger than a predetermined measure, the move is rejected. 

A good transition probability should also contain random displacements, so 
that all of phase space can be sampled. The combination of the desired drift 
arising from the quantum force of Eq. ^ with a Gaussian random move, gives 
rise to Langevin fictitious dynamics, namely, 

r' ^R+iFq(R) + 55„ (22) 

where Gsr is a number sampled from a Gaussian distribution with standard 
deviation 6t. The propagator or transition probability for Eq. |2| is 

Ti(R ^ R') = ^ _ e^(R'-R-|F,(R)^.)V2^r (23) 

which is a drifting Gaussian, spreading in St. Using Eq. |2^ is equivalent to 
finding the solution of the Fokker-Planck equation , 

^Z^^ lv.(V-Fq)/(R,r), (24) 

Equation ^ has proved to be a simple and effective choice for a VMC tran- 
sition probability. More refined choices can be made, usually with the goal of 
increasing acceptance probabilities in regions of rapid change in |^'t(R)^|, such 
as close to nuclei. For a more detailed discussion of this formaHsm, the reader 
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is directed to Chapter 2 of More elaborate transition rules can be found in 



3.1.2 Statistics 

Usually, VMC calculations are performed using an ensemble of A^v random 
walkers, W = {Ri, R2, . . . , Rn} that are propagated following T(R R') 
using the probability P(R — > R') to accept or reject proposed moves for en- 
semble members. Statistical averaging has to take into account auto-correlation 
between moves that arises if the mean square displacement for the ensemble, 
A(R — » R')^/iVw, is sufficiently large. In such cases, observables measured at 
the points R' will be statistically correlated with those evaluated at R. The 
variance for an observable, O, measured over Ng MC steps of a random walk is, 

^ -(0.-(0», (25) 



^ NsNw 

where (O) is the average of the observations, Oi over the sample. A simple 
approach to remove auto-correlation between samples is to define a number of 
blocks, Ni,, where each block is an average of of Ng steps, with variance, 

..-^(0.-(0)), (26) 

where Ob is the average number of observations Nt in block b. If Nt is 
sufficiently large, ctb is a good estimator of the variance of the observable over 
the random walk. The auto-correlation time is a good measure of computational 
efficiency, and is given by 



T,orr = lim iV J ^ (27) 




The efficiency of a method depends on time step ||8^. Serial correlation 
between sample points should vanish for an accurate estimator of the variance. 
For an observable (O) , the serial correlation coefficient is defined as, 

N-k 

where k is the number of MC steps between the points Oi and Oi+k- The 
function |2^ decays exponentially with k. The correlation length, L, is defined 
as the number of steps necessary for to decay essentially to zero. For an 
accurate variance estimator, blocks should be at least L steps long. 

The efficiency of a simulation is inversely proportional to ^k- The £,k de- 
pendence on time step is usually strong ||^; the larger the time step, the fewer 
steps /block L necessary, and the more points available for calculating the global 
average (O). A rule of thumb is to use an Nt ~ 10 times larger than the 
auto-correlation time to insure statistical independence of block averages, and 
therefore a rehable variance estimate. 
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The VMC method shares some of the strengths and weaknesses of tradi- 
tional variational methods: the energy is an upper bound to the true ground 
state energy. If reasonable trial functions are used, often rehable estimates of 
properties can be obtained. For quantum MC applications, VMC can be used 
to obtain valuable results. In chemical appHcations, VMC is typically used to 
analyze and generate trial wave functions for PMC. 



3.2 Trial wave functions 

In contrast to wave function methods, where the wave function is constructed 
from linear combinations of determinants of orbitals, QMC methods can use 
arbitrary functional forms for the wave function subject to the requirements 
in Sec. |2[ Because QMC trial wave functions are not restricted to expansions 
in one-electron functions (orbitals) , more compact representations are routinely 
used. In this section, we review the forms most commonly used for QMC cal- 
culations. 

Fermion wave functions must be antisymmetric with respect to the exchange 
of an arbitrary pair of particle coordinates. If they are constructed as the 
product of N functions of the coordinates, (/)(ri, r2, . . . tat), the most general 
wave function can be constructed enforcing explicit permutation. 



,(Jir2, (72, ■■■ ,rN, (In), (29) 



n.7n 



where P„ is the n*'* coordinate permutation operator, Pn<p{ri, r2, . . . r^, r^, . . . r^) 
(/)(ri,r2, . . .rj,ri,. . .tn), and Sm(j){cri, a2, C7i, cTj . . . , a^) = 4>{cri, cr2, aj , . . . , ctat) 
is the m*'* spin coordinate permutation operator. 

If the functions (pi depend only on single-particle coordinates, their antisym- 
metrized product can be expressed as a Slater determinant, 

i?(R,E) - -^det|0i,...,(/.,(r,,fT,),...,<^„| (30) 

Trial wave functions constructed from orbitals scale computationally as iV^ , 
where N is the system size, compared to TV! for the fully antisymmetrized form|^. 
The number of evaluations can be reduced by determining which permutations 
contribute to a particular spin state. 

For QMC evaluation of properties that do not depend on spin coordinates, S, 
for a given spin state, the Ml configurations that arise from relabeling electrons, 
need not be evaluated. The reason is that the Hamiltonian of Eq. |^, contains 
no magnetic or spin operators and spin degrees of freedom remain unchanged. 
In this case, and for the remainder of this paper, (T| electrons do not permute 

^The evaluation of a determinant of size requires N'^ computer operations. If the one- 
electron functions scale with system size as well, the scaling becomes N^. In contrast, a 
fully antisymmetrized form requires the explict evaluation of the A*"! permuations, making the 
evaluation of this kind of wave functions in QMC prohibitive for systems of large N. 
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with tJj^ electrons, so that the full Slater determinant (s) can be factored into 
a product of spin-up, and spin-down, determinants. The number of 
allowed permutations is reduced from (iV| + iV|)! to iV|!7V|! Q ||]. 

The use of various wave function forms in QMC has been explored by |Q , 
as well as ||8^. Fully antisymmetric descriptions of the wave function are more 
flexible and require fewer parameters than determinants, but their evaluation is 
inefficient due to the A^! scaling. 

A good compromise is to use a product wave function of a determinant or 
Hnear combination of determinants, e.g., HF, MCSCF, CASSCF, CI, multipHed 
by a correlation function that is symmetric with respect to particle exchange. 



^-T = T^T, (31) 

Here V denotes the antisymmetric wave function factor and T is the sym- 
metric factor. We now describe some of the forms used for T> and then we 
describe forms for !F. Such products are also known as the correlated molecular 
orbital (CMO) wave function. 

In the CMO wave functions, the antisymmetric part of the wave function is 
constructed as a determinant of independent particle functions, (pi (see Eq. |30| ). 
The are usually formed as a Hnear combination of basis functions centered 
on atomic centers, = Y^jCjXj- The most commonly used basis functions 
in traditional ah initio quantum chemistry are Gaussian functions, which owe 
their popularity to ease of integration of molecular integrals. Gaussian basis 
functions take the form, 

XG ^ x'^y^^e-^^" (32) 
For QMC applications, it is better to use the Slater-type basis functions, 

Xs = x'^y'z^e-^' (33) 

because they rigorously satisfy the electron-nuclear cusp condition (see Eq. 
^, and the asymptotic property of Eq. ^ Nevertheless, in most studies, Gaus- 
sian basis functions have been used, and corrections for enforcing the cusp con- 
ditions can be made to improve local behavior close to a nucleus. For example, 
in one approach |^^, the region close to a nucleus is described by a Slater-type 
function, and a polynomial fit is used to connect the Gaussian region to the 
exponential. This procedure strongly reduces fiuctuations of the kinetic energy 
of these functions, a desirable property for guided VMC and Green's function 
methods. 

The symmetric part of the wave function is usually built as a product of 
terms expHcitly dependent on inter-particle distance, — |r.i — rj\. These 
functions are usually constructed to reproduce the form of the wave function 
at electron-electron and electron-nucleus cusps. A now familiar form is that 
proposed by | ^ , ^ and known as the Jastrow ansatz., 

= e'^('''^) = en.<.»'^, (34) 
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where the correlation function gij is, 



with constants specified to satisfy the cusp conditions, 
i , if ij are like spins 

i , if ij arc unlike spins (36) 
1 , if ij arc electron/ nucleus pairs 

Electron correlation for parallel spins is taken into account by the Slater 
determinant. 

This simple Slater-Jastrow ansatz has a number of desirable properties. 
First, as stated above, scaHng with system size for the evaluation of the trial 
function is N^, where N is the number of particles in the system, the correct 
cusp conditions are satisfied at two-body coalescence points and the correla- 
tion function gij approaches a constant at large distances, which is the correct 
behavior as rij oo. 

In general, the inclusion of 3-body and 4-body correlation terms has been 
shown to improve wave function quality. The work of |Q shows that if the 
determinant parameters Xd are optimized along with the correlation function 
parameters, Ac, one finds that the nodal structure of the wave function does not 
improve noticeably in going from 3- to 4-body correlation terms, which suggests 
that increasing the number of determinants. No is more important than adding 
fourth- and higher-order correlation terms. 

The use of Feynman-Cohen backflow correlations |Q, which has been sug- 
gested for the inclusion of three body correlations in U, has been used in 
trial functions for homogeneous systems such as the electron gas ||95| , Q and 
Hquid hehum |^, Feynman |Q suggested replacing the orbitals by func- 
tions that include hydrodynamic backflow effects. His idea was based on the 
conservation of particle current and the variational principle. The procedure 
involves replacing mean field orbitals by 6acA;/i!ow-corrected orbitals of the form, 

(j^ni^i) 0„(ri + ^ r,y i^(rij)), (37) 

where i'(ry ) is the backflow function. Others iQ proposed that i^{rij) should 
consist of the difference between the I = and I = 1 states of an effective two- 



particle Schrodinger equation. Furthermore, they proposed |10C] the inclusion 
of a tail, as originally suggested by Feynman and Cohen, 



j/(r) = A,e"[^] (38) 

where, X^, X^' , and uj,^ are variational parameters. As recently noted the 
incorporation of the full backflow trial function into wave functions involves a 



A. 
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power of N increase in computational expense, but yields a better DMC energy 
for the electron gas^. 

Recently, one has seen the practice of taking orbitals from a mean field cal- 
culation and the inclusion of averaged backflow terms in the correlation function 
!F. The advantage of this approach is that orbitals are unperturbed and readily 
obtainable from mean field computer codes. 

The correlation function form used by Q| is a selection of certain terms 
of the g ener al form originally proposed in connection with the transcorrelated 
methodjlQll, 

J^^e'^'.'<^""\ (39) 

where 



N{I) 



k 

The sum in ^ goes over / nuclei, ij electron pairs, and the sum in Eq. ^ is 
over the N[T) terms of the correlation function for each nucleus. The parameters 
m, n and o are integers. The function A(m,n) takes the value 1 when m ^ n, 



and i otherwise. The functions ^-re specified by Eq. 35 



This correlation function |39|, |4C| can be shown to have contributions to aver- 
aged backflow effects from the presence of electron-electron-nucleus correlations 
that correspond to values of m, n and o in Eq. |o| of 2,2,0 and 2,0,2. These 
contributions recover « 25% or more of the total correlation energy of atomic 
and molecular systems above that from the simple Jastrow term |p4[. 



3.3 The variational Monte Carlo algorithm 

The VMC algorithm is an appHcation of the generalized MetropoHs MC method. 
As in most applications of the method, one needs to insure that the ensemble 
has achieved equilibrium in the simulation sense. EquiHbrium is reached when 
the ensemble W is distributed according to 7^(R) This is usually achieved by 
performing a Metropolis random walk and monitoring the trace of the observ- 
ables of interest. When the trace fluctuates around a mean, it is generally safe 
to start averaging in order to obtain desired properties. 
An implementation of the VMC algorithm follows: 

1. EquiHbration stage 

(a) Generate an initial set of random walker positions. Wo; it can be 
read in from a previous random walk, or generated at random. 

(b) Perform a loop over Ng steps. 



As discussed in Sec. 5.6, an improved fixed-node energy is a consequence of better nodes of 
the trial wave function, a critically important characteristic for importance sampling functions 
in QMC methods. 
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i. For each of the Np number of particles, 

A. Propose a move from ^'(R) = \l/(ri, r2, . . . , r^, . . . , r^r^^) to 
5'(R') = 5'(ri,r2, . . . ,r^, . . . ,rArp). Move from r to r' ac- 
cording to 



Qsr + ^F^St, (41) 



where Gst is a Gaussian random number with standard devi- 
ation (5t, which is a proposed step size, and Fq is the quantum 
force (see Eq. ^l|) . This is the Langevin dynamics of Eq. 

B. Compute the Metropohs acceptance/rejection probabihty, 

where Tl is given by Eq. |2^. 

C. Compare P(R R') with an uniform random number be- 
tween and 1, ^0,1]- If -P > ^o,i]j accept the move, other- 
wise, reject it. 

D. Calculate the contribution to the averages '^^ ^^-^^ ^ , and per- 



form blocking statistics as described in Sec. 3.1.2 



ii. Continue the loop until the desired accuracy is achieved. 



4 Wave function optimization 



Trial wave functions ^'t(R, A) for QMC are dependent on variational param- 
eters, A = {Ai,...,A„}. Optimization of A is a key element for obtaining 
accurate trial functions. Importance sampling using an optimized trial func- 
tion increases the efficiency of DMC simulations. There is a direct relationship 
between trial-function accuracy and the computer time required to calculate ac- 
curate expectation values. Some of the parameters Xi may be fixed by imposing 
appropriate wave function properties, such as cusp conditions (See Sec. ||). 

It is useful to divide A into groups distinguished by whether the optimization 
changes the nodes of the wave function. The Slater determinant parameters, 
Ax)Ti and the Slater determinant weights, Afc. change wave function nodal struc- 
ture 1 102, 103, 1^ 104, 2^. The correlation function parameters, Ajr do not 
change the nodal structure of the overall wave function, and therefore the DMC 
energy. For some systems, the optimization of Ajr is sufficient for building reli- 
able trial functions for PMC methods, because is designed in part to satisfy 
cusp conditions |^, p7|| . 

There have been several optimization m etho d s proposed pr e viou sly. Some 
involve the use of analytical derivatives |85, 105| , 106, 107, lOS, 109|, and oth- 
ers focus on the use of a fixed sample for variance minimization Conroy |11C|, 
and more recently others |102|, 111, 112|. Yet another direction is the use of 



histogram analysis for optimizing the energy, variance, and molecular geometry 
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for small systems ]113|. In the present study, we concentrate on fixed sample 



optimization to eliminate stochastic uncertainty during the random walk |111|. 

Other authors optimize the trial wave function using information obtained 
from a DMC random walk This approach shows promise, because usu- 

ally the orbitals obtained from a mean field theory, such as HF or LDA, are 
frozen and used in the DMC calculation without re-optimization specifically for 
correlation effects within the DMC framework. 



The common variance functional (VF) [102| is given by 



VF = 



2^2=1 



g1'(Ri,A) 
*(R.i,A) 



Wi 



N 



(43) 



where Et is a trial energy, Wi is a weighting factor defined by 



and Aq is an initial set of parameters. The sum in Eq. 
configurations. 



(44) 

is over fixed sample 



4.0.1 Trial wave function quality 

The over lap of with the ground state wave function, (^'tI^'o), by DMC 
methods, 1 115 1 is an very efficient way of assessing wave function quality. There 
is also a trend that correlates the variational energy of the wave function with 
associated variance in a linear relationship 1 95 , 9^ . This correlation is expected 
because both properties, Se, and Sel i approach limits - Eo and zero respectively 
- as wave function quality improves. Observing this correlations is a good 
method of validating the optimization method, as well as assessing wave function 
quality. 



5 Projector methods 

QMC methods such as DMC and GFMC are usefully called projector Monte 
Carlo (PMC) methods [|. The general idea is to project out a state of the 
Hamiltonian by iteration of a projection operator, P. For simplicity, we assume 
that the desired state is the ground state, '^o, but projectors can be constructed 
for any state, 

lim P'I^'t) ~ |*o)- (45) 

i — >oo 

After sufficient iterations i, the contribution of all excited states will 
be filtered out, and only the ground state is recovered. 



If I^t) is a vector and P is a matrix, then the procedure implied by 45 is the 



algebraic power method: If a matrix is applied iteratively to an initial arbitrary 



^The introductory section of this Chapter, follows the work of jll6| , 117, lisj 
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vector for a sufficient number of times, only the dominant eigenvector, |4'o), will 
survive. One can see for large i, 



(46) 



where Ao is the leading eigenvalue, and Ai is the largest sub-leading eigen- 
value. 

For this approach, it is possible to obtain an estimator of the eigenvalue, as 
described in ||6j, given by, 



5.0.2 Markov processes and stochastic projection 

For high dimensional vectors, such as those encountered in molecular electronic 
structure, the algebraic power method described previously needs to be general- 
ized with stochastic implementation. For this to occur, the projection operator 
must be symmetric, so that all eigenvalues are real. This is the case for QMC 
methods, because P is a function of the Hamiltonian operator, H, which is 
Hermitian by construction. 

A stochastic matrix is a normalized non-negative matrix. By normalization, 
we mean that the stochastic matrix columns add to one, J2i -^^u = 1- 
space representation would be a stochastic propagator M(R, R') that satisfies 
the condition 



A Markov chain is a sequence of states obtained from subsequent transitions 
from state i to j with a probability related to the stochastic matrix element 
Mij, in which the move only depends on the current state, i. For example, in 
R-space, this is equivalent to the following process 



The sequence of states S = {7r(R"), 7r(R'), 7r(R), . . .}is the Markov chain. 

The propagators of QMC for electronic structure are not generally normal- 
ized, therefore they are not stochastic matrices, but we can represent them in 
terms of the latter by factoring. 




(47) 




(48) 




(49) 



(50) 
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where the weights, Wj, are defined by Wj = J2i Pij- This definition unambigu- 
ously defines both, the associated stochastic matrix M and the weight vector 
w. 

A MC sampHng scheme of P^I^'t) can be generated by first performing a 
random walk, and keep a weight vector, iy(R) for the random walkers. 



*(R') = 7r(R')VK(R') = J P{'R','R"}^(R")d'R" = J M{'R','R")B{'R')^('R")dR" 
*(R) = 7r(R)VK(R) = / P(R,R')'I'{R')rfR' = / A/{R,R')-B(R)'^(R')rfR-' (51) 



Here, -B(R) is the function that determines the weight of the configurations 
at each state of the random chain. This leads to a generalized stochastic pro- 
jection algorithm for unnormaHzed transition probabilities that forms the basis 
for population Monte Carlo (PopMC) algorithms, which are not only used for 
QMC, but they are also used for statistical information processing and robotic 



vision 1 119 1 . A generalized PopMC stochastic projection algorithm, represented 



in R-space follows, 

1. INITIALIZE 

Generate a set of n random walkers, located at different spatial positions, 
W = {Ri, R2, . . . , Rn}, where Ri denotes a Dirac Delta function at that 
point in space, (5(R — Ri). These points are intended to sample a proba- 
bility density function <&(R). 

2. MOVE 

(a) Each walker j is moved independently from R to a new position R', 
according to the transition probability 

T(R^ R') = Af(R,R') (52) 

(b) Ensure detailed balance if r(R — > R') 7^ T(R' — > R), by using a 
MetropoHs acceptance/rejection step as in Eq. ^ 

3. WEIGHT 

(a) Calculate a weight vector using a weighting function B(R,i) 

w* = S(R,) (53) 

The ideal weight function preserves normalization of -P(R, R') and 
maintains individual weights, Wi close to unity. 

(b) Update the weight of the walker, multiplying the weight of the pre- 
vious iteration by the weight of the new iteration, 

w- = w* * Wi (54) 



18 



4. RECONFIGURATION 



(a) Split walkers with large weights into multiple walkers with weights 
that add up to the original weight. 

(b) Remove walkers with small weight. 

Step 4 is necessary to avoid statistical fluctuations in the weights. It is a form of 
importance sampHng in the sense that makes the calculation stable over time. 
Some algorithms omit this step; see, for example, eff orts by |Q, but it has 
been proved that such calculations eventually diverge [|l2C| . There is a slight 
bias associated with the introduction of step 4 together with population control 
methods, that will be discussed in Sec. |5.3| . When step 4 is used, -B(R) is also 
referred in the literature as a branching factor. 

It is important to recall that PopMC algorithms are not canonical Markov 



Chain Monte Carlo (MCMC) algorithms [|l2l| , |6S(], in the sense that the prop- 
agator used is not normalized, and therefore factoring the propagator into a 
normalized transition probability and a weighting function is required. 

5.0.3 Projection operators or Green's functions 

Different projection operators lead to different QMC methods. If the resolvent 
operator, 

P{H) = 1 , (55) 

l + SriH-ERY 



is used, one obtains Green's function Monte Carlo (GFMC) pi 122 1. This 



algorithm will be described in Sec. 5.5 



If the imaginary time evolution operator is used, i.e., 

P(i7) = e-^"(^-^«), (56) 



one has the DMC method ||l23| , |l24|] , which is discussed in Sec. |5.1.2 



For flnite St, and for molecular systems, the exact projector is not known 
analytically. In GFMC, the resolvent of Eq. |5^ is sampled by iteration of a 
simpler resolvent, whereas for DMC, the resolvent is known exactly at r ^ , 
so an extrapolation to 6t ^ is done. 

Note that any decreasing function of H can serve as a projector. Therefore, 
new QMC methods still await to be explored. 

5.1 Imaginary propagator 

If one transforms the time-dependent Schrodinger equation (Eq. ^ to imaginary 
time, i. e., 

it^T (57) 

then one obtains 
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^^{II,t) = (H - En)^ (R, r) . (58) 

Here is an energy offset, called the reference energy. For real \E'(R, r), 
this equation has the advantage of being an equation in TZ^ , whereas Eq. g, 
has in general, complex solutions. 

Equation ^s] can be cast into integral form, 

^'(R, T + ST) = Xr J G(R, R', ,5r)*(R', r)dR' (59) 

The Green's function, G(R ,R, Jr), satisfies the same boundary conditions 
as Eq.. 

^G(R, R', St) ^{H- Et)G{I{., R', 5t) (60) 

with the initial conditions associated with the propagation of a Dirac delta 
function, namely, 

G(R, R', 0) = (5(R - R') (61) 
The form of the Green's function that satisfies Eq. subject to |6l| is, 

G(R, R', 5t) ^ (R|e-"(^-^"' |R) (62) 

This operator can be expanded in eigenfunctions, \E'a, and eigenvalues Ea 
of the system, 

G(R, R', 6t) = e""(^°~^«)*:(R')*a(R) (63) 

a 

For an arbitrary initial trial function, ^'(R), in the long term limit, r oo, 
one has 



lim e~^(^"-^^)* = lim / G(R', R, r)^'(R')rfR' = (64) 

lim (*|*o)e-"(^°-^«Vo, 

r — ^oo 

and only the ground state wave function '^o is obtained from any initial 
wave function. Therefore, the imaginary time evolution operator can be used 
as a projection operator as mentioned at the beginning of this Chapter. 



5.1.1 Diffusion Monte Carlo stochastic projection 

Due to the high dimensionality of molecular systems, a MC projection procedure 
is used for obtaining expectation values. In this approach, the wave function 
is represented as an ensemble of delta functions, also known as configurations, 
walkers, or psips (psi-particles): 
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— *^(5(R-Rk) (65) 

k 

The wave function is propagated in imaginary time using the Green's func- 
tion. In the continuous case, one can construct a Neumann series, 



*«(R..)^A,/G(R',R..-n)*«(R')^ 

*(3)(R,^) ^ Azy" G(R',R,r3-r2)5'(2)(R')(^R 

(...) (66) 

This Neumann series is a specific case of the PopMC propagation of Sec. 



5.0.2 . The discrete Neumann series can be constructed in a similar way: 

(R, r + Jr) <—> Afe ^ G^") (R, R', St) (67) 

k 

Therefore, a stochastic vector of configurations W = {Ri, . . . ,Rn} is used 
to represent ^'(R) and is iterated using G'"^(R, R', (5r). 



5.1.2 The form of the propagator 

SampUng Eq. ^ can not be done exactly, because the argument of the expo- 
nential is an operator composed of two terms that do not commute with each 
other. 

In the short-time approximation (STA), the propagator G(R, Rk,rfT) is ap- 
proximated as if the kinetic and potential energy operators commuted with each 
other. 



e(^+^)^- « e^^^ • e^*^ + 0((<5t)') = G5T = Gd-Gb (68) 

The Green's function then becomes the product of a diffusion factor, Gd 
and a branching factor Gb- Both propagators are known, 

GD = (2^r)-3^/2e-^^^, (69) 

and, 

GB=e-'"(^(^)~'^"^. (70) 

Gd is a fundamental solution of the Fourier equation (that describes a dif- 
fusion process in wave function space) and Gs is the fundamental solution of a 
first-order kinetic birth-death process. 

The Campbell-Baker-Haussdorff (CBH) formula. 
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-"^P^ = f,A+B+^[AM] + ^_[(A~B),[A,B]] + . 



(71) 



can help in constructing more accurate decompositions, such as an expansion 
with a cubic error 0{{5tY), 

^SriT+V) ^ ^Sr(V/2)^SrT^5riV/2) ^ 0((J^)3) (73) 



There are more sophisticated second order, [117|, and fourth order, [125, 126|| 



, expansions that reduce the error considerably and make more exact DMC 
algorithms at the expense of a more complex propagator. 

The most common implementation using G^i as a stochastic transition prob- 
ability r(R R'), and Gb as a weighting or branching factor, i?(R). Sampling 
of Eq. ^ can be achieved by obtaining random variates from a Gaussian dis- 
tribution of standard deviation Sr , Gst ■ 



5.2 Importance sampling 

Direct application of the algorithm of the previous section to systems governed 
by the Coloumb potential leads to large population fluctuations. These arise 
because the potential V^(R) becomes unbounded and induces large fluctuations 
in the random walker population. A remedy, importance sampling, was flrst 
used for GFMC in 1962 @ and extended to the DMC method in 1980 [||. 

In importance sampHng, the goal is to reduce fluctuations, by multiplying 
the probability distribution by a known trial function, ^fyCR), that is expected 
to be a good approximation for the wave function of the system. Rather than 
^'(R, r), one samples the product. 



/(R,r) =vI/j,(R)vI/(R,r) (73) 
Multiplying Eq. |5^ by 5't(R), one obtains, 

/(R, T + dr)^ J K{R', R, (5t)/(R', r)dR', (74) 

where i^(R, R',5r) = g^'^i^^-Er) ^^^^^}^ . Expanding in a Taylor series, 
at the 6t ^ Hmit, one obtains the expression. 



K = Are-(I^^-I^i + iVln*T(Ri)5r)V(25r) ^ ^~i^^i0^-ET)Sr _ ^ ^^^^ 

Equation ^ is closely associated to the product of the Kernel of the Smolu- 
chowski equation, which describes a diffusion process with drift, multiplied by a 
flrst order rate process. Here the rate process is dominated by the local energy, 
instead of the potential. The random walk is modifled by appearance of a drift 
term that moves conflgurations to regions of high values of the wave function. 
This drift is the quantum force of Eq. 
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The excess local energy {Et — ElCR)) replaces the excess potential energy in 
the branching term exponent, see The local energy has kinetic and potential 
energy contributions that tend to cancel each other, giving a smoother function: 
If 5't(R) is a reasonable function, the excess local energy will be nearly a 
constant. The regions where charged particles meet have to be taken care of by 
enforcing the cusp conditions on 5't(R.) (See Sec. 

The local energy is the estimator of the energy with a lower statistical vari- 
ance, so it is preferred over other possible choices for an estimator. A simple 
average of the local energy will yield the estimator of the energy of the quantum 
system^. 



(El) = J f{K,T^oo)EL{R)dR/ J f{R)dR (76) 



*(R)*t(R) 



i7*T(R) 



dR/ J *(R)^'t(R) 



*(R)iJ«'(R)dR/ / 5'(R)^'T(R)dR 



= Eo 

Therefore, a simple averaging of the local energy, will yield the DMC energy 
estimator: 



(El) = Jim (77) 



Because the importance sampled propagator, K{R, R', St), is only exact to 
a certain order, for obtaining an exact estimator is necessary to extrapolate to 
St = for several values of (El). 

Importance sampHng with appropriate trial functions, such as those used 
for accurate VMC calculations, can increase the efficiency of the random walk 
by several orders of magnitude. In the limit required to obtain the exact trial 
function, only a single evaluation of the local energy is required to obtain the 
exact answer. Importance sampling has made molecular and atomic calculations 
feasible. Note that the quantum force present in Eq. ^ also moves random 
walkers away from the nodal regions into regions of large values of the trial 
wave function, reducing the number of attempted node crossings by several 
orders of magnitude. 



5.3 Population control 

If left uncontrolled, the population of random walkers will eventually vanish or 
fill all computer memory. Therefore, some form of population control is needed 
to stabilize the number of random walkers. Control is usually achieved by slowly 



'For other energy estimators, refer to the discussion in |122| and 
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changing Et as the simulation progresses. As more walkers are produced in the 
procedure, one needs to lower the trial energy, Et, or if the population starts 
to decrease, then one needs to raise Et- This can be achieved by periodically 
changing the trial energy. One version of the adjustment is to use, 

ET^{E,)+aln^, (78) 

where < -Eq > is the best approximation to the eigenvalue of the problem to 
this point, a is a parameter that should be as small as possible while still having 
a population control effect, is the number of desired random walkers, and 
iV^ is the current number of random walkers. 

This simple population control procedure has a slight bias if the population 
control parameter a is large, or if the population is small. The bias observed 
goes as 1/Nw, and, formally a oo extrapolation is required. The bias is 

absent in the limit of an infinite population. 

A recently resurrected population control strategy, stochastic reconfiguration 



[127, 128 , 12£, 12C| originally came from a the work of |116|. In this algorithm, 
walkers carry a weight, but the weight is redetermined at each step to keep 
the population constant. The idea behind this method is to control the global 



weight w of the population, 

JV. 

n7 



The idea is to introduce a renormaHzed individual walker weight, uji, defined 



tO^=^. (80) 

w 

Another stochastic reconfiguration scheme proposes setting the number of 
copies of walker i for the next step, proportional to the renormalized walker 
weight, LUi. This algorithm has shown to have less bias than the scheme of Eq. 
|78| , and also has the advantage of having the same number of walkers at each 
step, simplifying implementations of the algorithm in parallel computers. 

5.4 DifFusion Monte Carlo algorithm 



There are several versions of the DMC algorithm. The approach presented 
here focuses on simplicity. For the latest developments, the reader is referred to 



the references |124, 130, |3| 



1. Initialize an ensemble W of iVyy configurations, distributed according to 
P(R) for ^t(R-); for example, use the random walkers obtained from a 
previous VMC run. 

2. For every configuration in W, 
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(a) Propose an electron move from ^(R) = 'i'(ri, r2, . . . , r^, . . . , rAr^) to 
4'(R') = 'i'(ri, r2, . . . , r,-, . . . , r^^). The short-time approximation 
propagator, /ir(R, R'; (5t), has an associated stochastic move, 

R' ^ R + Fq(R),5T + Gsr (81) 

(b) Enforce the fixed node constraint: if a random walker crosses a node, 
i.e. sign(5'T(R)) 7^ sign(^'T(R'))j then reject the move for the cur- 
rent electron and proceed to treat the next electron. 

(c) Compute the Metropolis acceptance/rejection probability 



P(R^RO^mmfl,^-(^'^'^;f|f)V 



(82) 



where Kd is the diffusion and drift transition probability given by 
Eq. ^. 

(d) Compare P(R R') with an uniform random number between 
and 1, ^[0,1], if -P > ^o.i]i accept the move, otherwise, reject it. 

Calculate the branching factor Gb for the current configuration 

P(R, R') = e^^"" = ' +^7(sr»*^ (83) 

Accumulate all observables, such as the energy. All contributions, Oi, are 
weighted by the branching factor, i. e., 

0(."+i) = o(?) + p(R, R')0,(R), (84) 

(n) 

where is the cumulative sum of the observable at step n. 

Generate a new generation of random walkers, reproducing the existing 
population, creating an average -B(R, R') new walkers out of a walker at 
R. The simplest procedure for achieving this goal is to generate n new 
copies of R where n — int(i?(R, R') + ^o.i])- 



Perform blocking statistics (see Sec. 3.1.2 ), and apply population control 
(see Sec. O) 



(a) One choice is to update the reference energy. En at the end of each 
accumulation block, 



Er^ Er + E%^Eb, (85) 

where E'f^ is a re-weighting parameter, usually chosen to be « 0.5, 
and Eb is the average energy for block B, Eb = E'^^"^ /Nb 

(b) Discard a relaxation time of steps, Nr^i, which is of the order of a 
tenth of a block, because moving the reference energy induces the 
most bias in about one relaxation time. 
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7. Continue the loop until the desired accuracy is achieved. 



Umrigar et al. ||l3l[| proposed several modifications to the above algorithm to 
reduce time-step error. These modifications concentrate on improving the prop- 
agator in regions where the short-time approximation performs poorly; namely, 
near wave function nodes and Coulomb singularities. These propagator errors 
are expected, because the short-time approximation propagator assumes a con- 
stant potential over the move interval, which is a poor approximation in \t' 
regions where the Coulomb interaction diverges. 

5.5 Green's function Monte Carlo 



The GFMC method is a QMC approach that has the advantage of having 
no time-step error. It has been shown to require more computer time than 
DMC, and therefore, has been applied less frequently than DMC to atomic and 



mol ecul ar sy stems. Good descriptions of the method can be found in [ pO| 132 
133| , |l3||. The GFMC approach is a PopMC method for which the projector 



for obtaining the ground state Green's function is the standard resolvent for the 
Schrodinger equation (see Eq. |5|). The integral equation for this case, takes 
the simple form. 



4- 



(n+l) _ 



Et + Ec 



^(») (86) 



H + Ec 

where the constant Eq is positive and fulfills the condition that \Ec\ > \Eo\, 
and Et is a trial energy. The resolvent of Eq. ^ is related to the DMC 
propagator by the one-sided Laplace transform, 

= Te-^^+^^'^dr (87) 

H + Ec Jo 

This integral is evaluated by MC. After equilibration, the sampled times 
have a Poisson distribution with a mean of ^ ^'^^ after Ns steps. Ec is a 
parameter that controls the average time step. 

The Green's function is not known in close form, so it has to be sampled by 
MC. This can be done by rewriting the resolvent in the form, 

^ = ^ + ^ (Hu - H)-^ (88) 

H + Ec Hjj + Ec Hu + Ec H + Ec 

The Hamiltonian Hu represents a family of solvable Hamiltonians. To sam- 
ple the Green's function, one samples the sum of terms on the right-hand side of 
Eq. 1^.. The Green's functions associated with H and Hu, satisfy the relations, 

(7? + £;c)G(R,R') = (^(R-R') (89) 
(iJc; + £;c)Gt/ (R, R') =<5(R-R') (90) 
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The most commonly used form of Hjj is, 

Hu = ^V^j + U (91) 

where U is, & potential that is independent of R. It is convenient to have 
G(7(^R, R') vanish at the domain boundary. Hu should be a good approximation 
to H in the domain to achieve good convergence. 

The R-space representation of Eq. |8| is 

G(R,R') = G,7(R,R')- / rfR"G(R,R") [-?^• VG,7(R",R)] 

Js 

+ f dR"G(R,R")[C/-l^(R")]Gc/(R",R') (92) 
Jv 



5.6 Fixed-node approximation 

We have not discussed the implications of the fermion character of ^'(R). It is 
an excited state in a manifold containing all the fermionic and bosonic states. 
A fermion wave function has positive and negative regions that are difficult to 



sample with the DMC algorithm as described in Sec. 5.4, Considering real wave 
functions, ^'(R) contains positive and negative regions, ^'+(R), and \I'~(R) 
that, in principle, could be represented as probabilities. The sign of the wave 
function could be used as an extra weight for the random walk. In practice, this 
is a very slowly convergent method. 

Returning to the importance sampled algorithm, recall that the initial distri- 
bution, |^'(R)p, is positive . Nevertheless, the Green's function, if(R, R'), can 
become negative, if a random walker crosses a node of the trial wave function. 
Again, the sign of K{Ii, R') could be used as a weight for sampHng |iir(R, R')l- 
The problem is that the statistics of this process lead to exponential growth of 
the variance of any observable. 

The simplest approach to avoid exponential growth is to forbid moves in 
which the product wave function, ^'(R)5't(R), changes sign. This boundary 
condition on permitted moves is the defining characteristic of the fixed-node 
approximation (FN A) . The nodes of the sampled wave function are fixed to be 
the nodes of the trial wave function. The FNA is an inherent feature of the 
DMC method, which is, by far, the most commonly used method for atomic 
and molecular MC applications | 12g| , 124 ]. 



The fixed-node energy is an upper bound to the exact energy of the system. 
In fact, it is the best solution for that fixed set of nodes. The DMC method 
has much higher accuracy than the VMC method. For atomic and molecular 
systems, it is common to recover 95 — 100% of the CE, cf Sec. |l|, whereas the 
CE recovered with the VMC approach is typically less than 80% of the total. 
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5.7 Exact methods 



Probably the most important algorithmic challenge that still remains to be ex- 
plored is the "node problem". Although progress has been made in systems that 



contain up to a dozen of electrons ||135| , |136| , |137| , [138| , |139| , a stable algorithm 
that can sample the exact wave function without resorting to the FNA remains 
to be determined. In this section, we discuss a family of methods that avoid 
the FNA. These approaches yield exact answers, usually associated with a large 
increase in computational time. 

The Pauli antisymmetry principle imposes a boundary condition on the wave 
function. It is the requirement that the exchange of like-spin electrons changes 
the sign of the wave function. This condition is a global condition that has to 
be enforced within an algorithm that only considers individual random walkers, 
i.e., a local algorithm. The FNA is the most commonly imposed boundary 
condition. It satisfies the variational principle, i.e, FN solutions approach the 
exact energy from above. This is an useful property, but one that does not assist 
the search for exact results, because there is not an easy way to parametrize 
the nodal surface and vary it to obtain the exact solution. We now describe 
methods that impose no additional boundary conditions on the wave function. 



5.7.1 The release node method 

The evolution operator, ^-^'^{h-Et) ^ symmetric and has the same form for 
both fermions and bosons. Straightforward application of it to an arbitrary 
initial wave function, |^o), leads to collapse to the ground state bosonic wave 
function, as can be seen from Eq. 

An arbitrary fermion wave function, ^'(R), can be separated into two func- 
tions, 'I'+(R), and *~(R), as follows, 

vI/±(R,r)^ip(R,r)|±vI,(R,r)]. (93) 
Note that the original trial wave function is recovered as, 

* (R, r) = *+ (R, r) - *+ (R, r) (94) 

The released node (RN) algorithm involves two independent DMC calcula- 
tions, using and '5^ as the wave functions to evolve. 



^'(R,t) =y"G(R,R',(5r)*(R',0)dR' = (95) 
j G{R,R',T)^+{R',0)dR' - J G(R,R',r)*-(R',0)(iR' = 

^+iR,T)~^-iR,T) 

The time evolution of the system can be followed from the difference of sep- 
arate simulations for \I>='=(R). Note that both distributions are always positive 
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during the simulation, and that they decay to the ground state bosonic wave 
function. This decay is problematic because the "signal-to-noise" ratio in this 
method depends on the difference between these two distributions. The decay 
of the difference ^'+(R, r) - ^'"(R,t) goes roughly as e^^^''"-^^^ where Ep is 
the lowest fermion state energy and Eb is the bosonic ground state energy. 

For this method to be practical, one needs to start with the distribution 
of a good fermion trial wave function. The distribution will evolve from this 
starting point to the bosonic ground state at large imaginary time r. In an 
intermediate "transient" regime one can collect information on the exact fermion 
wave function. 

The energy can be estimated from the expression. 



/^'+(R,r)i?*T(R)dR /*-(R, r)i?^'T (R)dR 



/ [«'+(R, t) - *-(R, t)] *T(R)dR / [*+(R, t) - "^-{H, t)] «'T(R)dR 

Ep 



In the release node method |136|, a fixed-node distribution is propagated as 
usual, but now two sets of random walkers are retained, Wfn ■, the fixed node 
ensemble and Wrn^ the released node ensemble. Walkers are allowed to cross 
nodes, and when they do, they are transferred from Wfn to Wrn- Also a 
account is made of the number of iterations that a walker has survived, Srn = 
{si, . . . , SAf„}. This index is used to bin the walkers by age. Each time a walker 
crosses a node, a summation weight associated with it, 0.rn — • . . ,^nJ\ 
changes sign. These weights determine the sign of the walker contribution to 
global averages. 

The released node energy can be calculated using the estimator, 

^i^N — • (97) 



1 = 1 



5.7.2 Fermion Monte Carlo 

From the previous section one can infer that if a method in which the dis- 
tribution does not go to the bosonic ground state, but stays in an interme- 
diate regime, will not have the deficiency of exponential growth of "signal to 
noise". T h is leads to the fermion Monte Carlo (FMC) method. The approach 



[ [140| , |l4l| , [142| , |l43| , |l44| involves correlated random walks that achieve a con- 
stant "signal to noise". 

The expectation value of Eq. |9^ for an arbitrary distribution of signed 
walkers can be rewritten as 
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^^^^^^^ = v.^^. ';.(V) ^.(^n j (^^^ 

where ^'^(R''') are the guiding functions for a pair of random walkers Pi = 
{R^,R,~}. Note that the variance of the energy estimator of Eq. |9| goes to 
infinity as the difference between the two populations goes to zero, i. e. the 
denominator, 

goes to zero as the simulation approaches to the bosonic ground state. A pro- 
cedure that would not change {Efmc) would be to cancel positive and negative 



random walkers whenever they meet |135|. Although random walks are guaran 



teed to meet in one dimension, they need not meet in several dimensions, due 
to the exponentially decaying walker density in R-space. Besides, cancellation 
has to be combined with other procedures to insure a stable algorithm. 

Cancellation can be increased by introducing correlation between the random 
walkers. Recall the diffusion step in DMC, in which walkers diffuse from R 
to R' following Gd of Eq. |6^. In the DMC algorithm, this is implemented 
stochastically by updating the coordinates of the random walkers with a random 
displacement taken from a Gaussian distribution with a variance of 5t, 

R'+ ^ R+ + g+ and R" ^ R" + . (100) 

If we introduce correlation between the Gaussian vectors, and G^^, the 
expectation value of Eq. ^ is not affected, because it is linear in the density of 
random walkers. 

An efficient cancellation scheme can be achieved if the Gaussian vectors are 
correlated as follows, 

Gsr = QtrU- 2(01 ■ III: lib ■ (R+ - R-), (101) 



Equation 101 accounts for reflection along the perpendicular bisector of the 
vector that connects the pair, R+ — R~. This cancellation scheme generates 
a correlated random walk in one dimension along the vector R+ — R~. This 
one-dimensional random walk is independent of the number of dimensions of the 
physical system, and therefore, overcomes the cancellation difliculties mentioned 
above. Walkers are guaranteed to meet under these conditions. 

The modiflcations to the DMC algorithm mentioned to this point are nec- 
essary, but not sufficient for achieving a stable algorithm. If one were to in- 
terchange the random walker populations, {R^", . . . , R-at^} ^ {R-F' ■ ■ • > R-JV^i' 
the fictitious dynamics would not be able to distinguish between the two popu- 
lations, leading to a random walk with two degenerate ground states. Namely, 
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a ground state in which all the positive walkers, R"*" are marginally on the 
positive region of the wave function, and vice versa, {^'+(R+), 4'~(R~)} and 
{4'+{R^}, ^^(R+)}. This plus-minus symmetry can be broken by using two 
distinct guiding functions. For example, the guiding function 



*± = v/*l(R-) + c2*a(R-) ± c*a(R), (102) 
where 4's(R) is a symmetric function under permutation of electron labels; 
\l/yi(R) is an antisymmetr ic fu nction, and c is a small adjustable parameter. The 



guiding functions of Eq. |102| are almost equal, which provides nearly identical 
branching factors for the walker pair. It is positive everywhere, a requirement for 
the DMC algorithm, and it is symmetric under permutation of the coordinates, 
*+(PR) = *5(R). 

The use of different guiding functions is the last required ingredient for a 
stable algorithm. It breaks the plus-minus symmetry effectively, because the 
drift dynamics is different because the quantum force of Eq. ^ is distinct for 
each population. 

For a complete description of the FMC algorithm, the reader is referred to 



[143|. 



The denominator of Eq. |99| is an indicator of stability of the algorithm. 
It is a measure of the antisymmetric component of the wave function. FMC 
calculations have shown stable denominators for thousands of relaxation times, 
indicating the stability of the fermion algorithm. 



Early versions of the method [135| do not scale well with system size, due 
to the use of uncorrelated cancellation schemes. Nevertheless, researchers have 
been applied successfully to several small molecular systems obtaining solutions 



to the Schrodinger equation with no systematic error |137, 145| , I4t, 147 1. This 



version of the FMC algorithm, with GFMC propagation and without correlated 
dynamics, is known as exact quantum Monte Carlo (EQMC). 

5.8 Zero variance principle 

An increase in computational efficiency can be achieved by improving the observ- 
ables O by renormalizing them to observables that have the same expectation 



value, but lower variance. Recent work |148, 149 1 has shown that estimators for 



the energy and energy derivatives with respect to nuclear coordinates can be 
constructed. 

One can propose a trial operator Hy and auxiliary trial function, such 
that the evaluation of a renormalized observable O will have a variance that 
is smaller than that of the original observable O, and in principle can even be 
suppressed. 

To develop this concept, let us construct a trial operator Hy such that, 

J Hv{R, R')^/n{R')dR' ^ 0, (103) 

where 7r(R') is the MC distribution. For example, in VMC the MC dis- 
tribution is the wave function squared, ^'t(R)^, and in DMC it is the mixed 
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distribution of Eq. |7^. Next, propose a renormalized observable 0(R) related 
to the observable 0(R) given by, 



0(R) = d(R) 



fHv(R, R')V'y(R')'^R-' 



vMRO 

The mean of the rescaled operator is formally, 

/d(R)vr(R)dR+ -^^^"^"-^"'"'^"-^"'^'^"'^"' 

<o>= 



(104) 



(105) 



/ 7r(R)dR 

which, by property |104 is the same as the mean for the unnormaHzed oper 



ator: 



< O >=< O > 



(106) 



Operator O can be used as an unbiased estimator, even though statistical 
errors for O and O can be quite different. The goal of this kind of importance 
sampHng is to reduce the fluctuations by construction of such an operator. 

The implementation of the procedure requires the optimization of a set of 
parameters for the auxiliary trial wave function, \E'v(R, A\/), using the mini- 
mization functional, 



i7y(R,R')*y(R,Av)dR' -[0{x)- < O >]\M^ 



(107) 



After the parameters Ay are optimized, one can run a simulation to average 
O, in stead of O. The choice of auxiliary Hamiltonian suggested by recent work 
[01 is 



Hv{x,y) 



1 2 1 

2 ^+2^^ 



(108) 



Note that when Eq. 108 is appHed to -yA^R), the R' integration vanishes by 
construction. The choice of auxiliary wave function is open, and an interesting 
observation is that minimization of the normalization factor of 5'v(R), for any 
choice of auxiliary trial wave function, will reduce fluctuations in the auxiliary 
observable. 



a[df=a{6f 



^6(R) J ff(R,R')*v(R')dR'^ 
^ 77(R) ' 



J H(R,R')*v(R')<iR' 



(109) 



because the second term on the right hand side of Eq. |109| always has a 
negative sign. 

This variance reduction technique, applied to VMC and GFMC simulations, 
has achieved an order of magnitude reduction in computational effort |148|. It 
can also be used to calculate energy derivatives |14£|. 
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Part IV 

Special topics 

6 Fermion Nodes 

As discussed briefly in chapter ||, the simulation of a quantum system without 
approximation, obtaining exact results other than the numerical integration 
scheme's associated error bar is still an open research topic. Several solutions 
have been proposed, but the challenge is to have a general method that scales 
favorably with system size. 

For the ground state of a bosonic system, for which the wave function has 
the same sign everywhere, QMC provides an exact solution in a polynomial 
amount of computer time, i.e., is already a solved problem. The research in this 
field attempts to obtain an algorithm that has the same properties but that can 
treat wave functions that have both positive and negative regions, and therefore 
nodes, with the same favorable scaling. 



Investigation of nodes has been pursued by |150, |l5l| , |15^, |153| , |154| to un- 
derstand the properties of the nodes of fermion wave functions. 

The full nodal hyper-surfaces of a wave function, ^'(R), where R is a 3N 
dimensional vector and N is the number of fermions in the system is a 3N — 1 
dimensional function, 77(R). Of that function, symmetry requirements deter- 
mine a 3A^ — 3 dimensional surface, the symmetry sub-surface, (t(R). This is 
unfortunate, because even though that o'(R) C ?7(R), the remainder of the nodal 
surface, the peculiar nodal surface, ruIL) which is a function of the specific form 
of the nuclear and inter-electronic potential, is difficult to be known a priori for 
an arbitrary system. Note that o-(R) U ro(R) = ^^(R). 

Understanding nodal properties is important for further development of 
QM C m ethods: these shall be exploited for bypassing the node problem. 

|l5l| discusses general properties of wave function nodes. General properties 
of nodes follow. 

1. The coincidence planes 7r(ri ~ rj), are located at nodes when the two 
electrons have the same spin, ie. Saij =1. In more than 1 dimension, 
7r(R) is a scaffolding where the complex nodal surface passes through. 
Note that 7r(R) C cr(R). 

2. The nodes possess all the symmetries of the ground state wave function. 

3. The nodes of the many-body wave function are distinct from orbital nodes(/)i(r). 



see Sec. 3.2. 



For degenerate wave functions, the node positions are arbitrary. For a 
p-fold degenerate energy level, one can pick p ~ 1 points in R and find a 
linear transformation for which the transformed wave functions vanish at 
all but one of these points. 
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A nodal cell, ri(R) around a point R is defined as the set of points that 
can be reached from R without crossing a node. For potentials of our 
present interest, the ground state nodal cells have the tiling-property: any 
point R' not on the node is related by symmetry to a point in ri(R). This 
implies that there is only one type of nodal cell: all other cells are copies 
that can be accessed by relabeling the particles. This property is the 
generalization to fermions of the theorem that the bosonic ground state is 
nodewave functionless. 



[151 1 suggests that for DMC simulations benefit from the tiling property: one 
only needs to sample one nodal cell, because all the other cells are copies of 
the first. Any trial function resulting from a strictly mean field theory will 
satisfy the tihng property. Such as the local density approximation (LDA) wave 
fun ction s. 

|152| showed that simple HF wave functions for the first-row atoms were 
shown to have four nodal regions (two nodal surfaces intersecting) instead of 
two. This is attributed to factorizing the wave function into two distinct Slater 
determinants, and , each composed of two surfaces, one for the t and one 



for the I electron, as discussed in Sec. 3.2. 



Recently, after analysis of the wave functions for He, Li and Be, it was 



conjectured by |154|, that the wave function can be factored as follows, 

*(R) =iV(R)e^(^', (110) 

where A^(R) is antisymmetric polynomial of finite order, and /(R) is a pos- 
itive definite function. A weaker conjecture is that N may not be a polynomial, 
but can be closely approximated by a lower-order antisymmetric polynomial. 
The variables in which N should be expanded are the inter-particle coordinates. 

For example, for all ■^S states of two-electron atoms, the nodal factor A^(R) 



in Eq. 110 is 



7V(ri,r2) = ri-r2, (111) 
where ri and r2 are the coordinates of the two electrons. 



7 Treatment of heavy elements 



So far, we have not discussed the applicability of QMC to systems with large 
atomic number. There is an steep computational scaling of QMC methods, with 
respect to the atomic number Z. T he compu tational cost of QMC methods has 
been estimated to scale as Z^-^^^-^|155, |156| . This has motivated the replace- 
ment of the core electrons by ECPs. With thi s mo dification, the scaling with 
respect to atomic number is improved to Z^[155|. Other approaches involve 
the use of core- valence separation schemes] 157 1 and the use of model potentials 
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7.1 Effective core potentials 

In the ECP method ||l5|, |l6^, |l6^, [l6| the effect of the core electrons is 
simulated by an effective potential acting on the valence electrons. The effective 
Hamiltonian for these electrons is: 

^- = E^+E;^+E>^^w> (112) 

■I l<j I 

where i and 7 designate the valence electrons, Zeff is the effective nuclear charge 
in the absence of core electrons, and W is the pseudo-potential operator. The 
latter can be written, 

00 

>V(r) = ^W,(r)^ I ?m)(?m I, (113) 

1=0 m 

where I and m are the angular momentum and magnetic quantum numbers. 
The projection operator | lm){lm |, connects the pseudo-potential with 
the one-electron valence functions. A common approximation to this equation 
is to assume that the angular momentum components of the pseudo-potential, 
wi{v) do not depend on I when I > L, the angular momentum of the core. This 
approximation leads to the expression 



W(r) = >VL+i(r) + ^(>VKr)- WL+i(r))^ | lm){lm | . (114) 



The operator ( |114| ) can be appHed to a valence orbital, i.e., pseudo-orbital, 
4>i(j)- This function is usually represented by a polynomial expansion for dis- 
tances less than a cutoff radius, r < Tc, and by a fit to the all-electron orbital 
for r > re- 
Rapid fluctuations in the potential terms can cause the flrst order approxi- 
mation of Eq. p2 to break down, therefore, seeking a slowly varying form of ECP 



is relevant to QMC simulations. |164| proposed the use of norm-conserving soft 
ECPs for QMC. Soft ECPs derive their name from the property of being finite 
at the nucleus, this leads to a pseudo-orbital with no singularities at the origin 
in the kinetic energy. The associated effective potential has no discontinuities 
or divergences. 

7.2 Embedding methods 

A commonly used approach in wave function based methods, is to use embed- 
ding schemes, in which a region of high interest of a large system is treated 
by an accurate procedure, and the remainder is described by a less accurate 



method. Recent work |165| has extended the methodology to QMC methods. 
In this approach, a mean field calculation, for example HF, is performed for the 
whole system. An electron localization procedure is performed, the orbitals to 
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be correlated are chosen and separated from the remaining core orbitals. An 
effective Coloumb and exchange potential is constructed, Ve, which is added to 
the standard Hamiltonian of Eq. |l| to construct an effective Hamiltonian, He, 
that then is used in QMC calculations. Localization procedures similar to those 
required for ECPs are needed for representing the effect of non local terms. 
The effective Hamiltonian, He, takes the form. 

He — Hint + Vext + Jext + K^xt + Sext (H^) 

where Hint is the Hamiltonian for the QMC active region, Vext is the Coloumb 
potential exerted by the external nuclei, Jext represents the Coloumb repulsions. 
The term Kext represents the exchange interactions, and Sext is a shift opera- 
tor that prevents the wave function to be expanded into core orbitals, 4>c, by 
shifting their energy to infinity, and is given by 

int ext 

Sext^ lim AV V|</)p(r))((/.p(r)|dr. (116) 

A^oo ^ — ^ ^ — ^ 
a 13 

Here A is an effective orbital coupling constant that is derived from consid- 
ering single and double excitations into core and virtual orbitals of the system. 
The Coloumb term, Jext, and the external Coloumb potential, Vext, are local 
potentials, and can be evaluated within QMC without further approximation. 
The remaining terms require localization approximations that are discussed in 
detail in the original work. 



8 Other Properties 

8.1 Exact wave function and quantities that do not com- 
mute with the Hamiltonian 

Properties related to a trial wave function ^t(R-), are readily available in VMC 



calculations] 16^. In this case, expectation values are calculated from directly 
from Eqs. |l£ and |l|. The accuracy of the results obtained with VMC depend 
on the quality of ^'t(R-). 

For obtaining expectation values of operators that do not commute with the 
Hamiltonian in an importance sampled PMC calculation, one needs to extract 
the exact distribution ^'^(R) from the mixed distribution /(R) = ^'(R)5't(R). 
The expectation values for an operator O, (*(R)|0|*t(R)) and (*t(R)|0|*(R)) 
are different to each other. MC sampHng requires knowledge of the exact ground 
state distribution: a mixed distribution does not suffice to obtain the exact an- 
swer. 

If the operator O is a multiplicative operator, then the algorithms described 
in this section will be pertinent. Non-multiplicative operators, which are exem- 
plified by forces in this chapter, are described in Sec. ^.2|. 
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8.1.1 Extrapolation method 



An approximate procedure for estimating the ground state distribution can 
be extrapolated from the mixed and VMC distributions. This procedure is 
valuable because no extra changes are needed to the canonical VMC and PMC 
algorithms. Being an approximate method, it can fail even in very simple cases 



[ P7| , but also has provided very accurate results in more favorable cases |167|. 
The mixed estimator of a coordinate operator O, is, 

to be distinguished from the pure estimator, 

~ /*(R)vI/(R)dR ■ ^^^^^ 

We will label (O)t, the VMC estimator of Eq. |l6[ {0)m can be rewritten 
in a Taylor series in the difference between the exact and approximate wave 
functions, S'i/ = *(R) - *t(R), 

(0),n = (0)p + J ^'((O)p - 0(R))5*rfR + 0{S^^). (119) 
A similar expansion can be constructed for (0)y, 

= {d)p + 2 J *((0)p - d(R))5*dR + 0{5^^). (120) 



Combining Eqs. |119| and [120| , we can arrive to an expression with a second 
order error, 

(d)e = 2(0),„ - (O), = (0)p + 0{S^^), (121) 

where (0)e is an extrapolation estimate readily available from VMC and PMC 
calculations. 



8.1.2 Future walking 



The future walking method can be combined with any importance sampled PMC 
method that leads to a mixed distribution. If one multipHes both sides of Eq. 

The ratio is obtained 
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117 by the ratio ^'(R)/4't(R), one can recover Eq. 
from the asymptotic population of descendants of a single walker |168|. 

A walker in R-space can be represented as a sum of eigenfunctions of H, 



S{-R' - R) = *(R') c.(R)*«(R) 



(122) 



ri=0 



The coefficients q(R) can be obtained by multiplying Eq. g^by ^'(R')/*t(R') 
and integrating over R', 
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c,(R) = /i(R'-R)^.R'.|<a ,123) 

Clearly, we want to know the contribution to the ground state wave function, 
Co(R) of the walker at R. If propagated for sufficiently long time, all coefficients 
Ci(R) ^ Co(R) for the random walker will vanish. This can be seen from the 
decay in t of Eqs. |6^ and 

If we define Poo (R) to be the asymptotic population of walkers descended 
from a random walker at R, we find. 



Poo(R) = / co(R)e- 



4'(R')*t(RVR 



, _ ^(R) -(go-g^)r 

^-tIR) 



*(R)|*t(R)) 
(124) 



For obtaining Poo (R) in a PMC algorithm, one needs to keep a list of all 
the descendants of each walker Ri at each time step . The number of steps 
for which one requires to keep track of the descendants, iV^, is a critical pa- 
rameter. The statistical error of the asymptotic walker population grows in the 
limit oo, and if only few steps are used, a bias is encountered by non- 

vanishing contributions from excited states Ci(R) 7^ Co(R). Efficient algorithms 
for keeping track of the number of descendants can be found in the literature 
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170 



168 



171| 



The wave function overlap with the ground state can also be obtained with 
these methods, as shown by 1 172 1. These methods have been applied for ob- 
taining di pole moments ||l73|| , transition dipole moments |174| and oscillator 
strengths ||l75| , among other appHcations. 

Other methods for obtaining the exact distribution that are not discussed 
here for reasons of space are bilinear methods |17£|, and time correlation meth- 
ods [p77| . 



8.2 Force calculation 



Most QMC applications have been within the Born-Oppenheimer (BO) ||178 | 
approximation. In this approximation, the nuclear coordinates Tl are fixed at 
a certain position during the calculation 0. Therefore, the wave function and 
energy depends parametrically on the nuclear coordinates, EiJZ) and ^'(R, 7?.). 
We will omit this parametric dependence for the remainder of the discussion, 
and simplify the symbols to E and ^'(R), when appropriate. 

Forces are derivatives of the energy with respect to nuclear displacements. 



P(7^) = ^nE{Tl) 



(125) 



Because the stochastic nature of the algorithm, obtaining forces in QMC is 
a difficult task. Generally, QMC calculations at critical points, e.g., equilibrium 

^The QMC method can be used for calculations without t he R O approximation, but so far 
the applications have been to nodeless systems, such as H^z |179|. 
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and reaction barrier geometries have been carried out , in which the geome- 
tries are obtained with a different quantum chemical method such as density 
functional theory (DFT) ]180| , or wave function methods. Whereas DFT and 
wave function methods use the Hellman-Feynman theorem for the calculation of 
forces, a straightforward application of the theorem in QMC leads to estimators 
with very large variance. 



8.2.1 Correlated sampling 

An efficient approach for force calculation is the use of correlated sampling, 
which is a MC method that uses correlation between similar observations to 



reduce the statistical error of the sampling. If one were to represent Eq. 125 in 
a finite difference scheme for evaluating a derivative along r^, 

dE E{n + Vd)-E{n) 

~ , (126) 

then one obtains an approximate energy derivative along the r^. If two 
separate calculations are carried out, with a statistical error of the energies of 
a El the statistical error for the difference Od is approximately, 

fTd « — , (127) 

One can see that because r^ is a vector of a small perturbation, of « 0.01 a.u., 
that the statistical error of the difference will be several times higher than the 
statistical error of the energies. If r^ is sufficiently small, a single random walk 
can be performed, while evaluating the energy at the original and perturbed 
geometries, E\^{TVi\ and E\^{JZ + rd)\. In this case, both the primary iJZ) and 
secondary (TVj walks will be correlated, and therefore present lower variance 
than uncorrelated random walks. 

If correlated sampling is used for forces in a PMC algorithm, it was re- 



cently proposed [181| to use expressions including branching factors -B(R), re- 
optimizing the parameters of the wave function A for each perturbed geometry, 
and perform additional coordinate transformations. Practical implementations 



of th e cor related sampling method for derivatives are described in detail in ||182 | 



and |181|. 



8.2.2 Analytic derivative methods 

The calculation of analytic derivative estimators is a costly process, both for 
wave function based methods, and for QMC methods. Fortunately, in QMC 
one needs not to evaluate derivatives at each step, but rather sample points 
more sporadically, both for reducing computer time and serial correlation. 
The local energy estimator for a DMC mixed distribution is, 

F ,_ /^o(R)i^L(R)^T(R)dR 

^° - ^^^^ - — TOR)^^:(RjdR— 
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The gradient of expression 128 involves derivatives of the unknown exact 
wave function 5'o(R), and the trial wave function ^'t(R). Derivatives of 4'o(R) 
have to be obtained with a method devised for sampHng operators that do not 



commute with the Hamiltonian, which are described in Sec. 8.1. This can 



lead to an exact estimator for the derivative, but with the added computational 
complexity of those methods. Therefore, a simple approximation can be used, 
replacing the derivatives of (R) with those of \E't(R) to obtain. 



^nE. « {^nE.m + 2(£;,^|^) - 2i^„(^|^). (129) 



*t(R) 



The derivatives of 5*7- (R) are readily obtainable from the known analytic 
expression of 5't(R). 

The expression for the exact derivative involves the cumulative weight of 
configuration Ri at time step s, Rf, 



(130) 



where s — So is the number of generations for the accumulation of the cumu- 
lative weight, and -B(Ri ) is the PMC branching factor of Eqs. |5| and The 
energy expression using cumulative weights is. 



Eo^ 



J ^T(R)^g(R)-EL(R)rfR 
/^'T(R)2B(R)dR 



(131) 



The energy derivative of expression 131 leads to the following derivative 
expression. 



WtzEo = {WTzEL{n))+2{EL{Il) 

VkB(R 



(ElCR) 



B(R) 



-En 



Vr^t(R) 
*t(R) 

; V7^^(R) 

^ S(R) 



,V7?,*t(R)> . 



Analytic energy derivatives have been applied to H2 |16£|, LiH and CuH 



[183|. Higher order derivatives can be obtained as well. Details on the former 
can be found on refs. [184| and [|l85t . 



8.2.3 Hellman-Feynman derivatives and the zero variance theorem 

The Hellman-Feynman theorem states that the forces can be obtained by looking 
at the value of the gradient of the potential. 



{VnEo 



/*2(R)VKF(R)dR 
/«'2(R)dR ' 



(133) 
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where V^(R) is the Coloumb potential for the system. A QMC estimator of 
the Hellman-Feynman forces, Fhf = —'^nV{'R), can be constructed, but it has 
infinite variance. This comes from the fact that at short electron-nucleus dis- 
tances, r^jv, the force behaves as F^f ~ -j— , therefore the variance associated 

with Fhf depends on (Ff^p), which is infinite. Furthermore, the Hellman- 
Feynman theorem on ly ho lds for exact wave functions, and basis set errors need 
to be accounted for |18€|| . Also, the fixed-node approximation introduces an 
extra requirement on the nodal surface. The former has to be independent of 
the position of the nuclei, or i t ha s to b e the exact one. An elaborate discussion 
of this issue can be found in [187| and ||188| . 

A proposed solution to the infinite variance problem is to evaluate the forces 
at a cut off d istance close to the origin, and then extrapolation to a cutoff distance 
of zero |184| . This has the problem that the extrapolation procedure is difficult 
due to the increase of variance as the cutoff values decrease. 



As discussed in Sec. 5.8, renormalized operators can be obtained, in such a 
way that they have the same expectation value, but lower variance. Recently 



[149 1, a renormalized operator was introduced. 



' HF 



' HF 



^-tIR)' 



(134) 



Here, ^'y(R) are an auxiliary wav e function and Hv is an auxiliary Hamil- 
tonian. The variance of operator 134 can be shown to be finite, and therefore 
smaller than Fhf- The form of 5'y(R) proposed by |149| is a simple form 
that cancels the singularities of the force in the case of a diatomic molecule. 
Nevertheless, general forms of the auxiliary wave function can be constructed. 



Variational Monte Carlo dynamics 

Correlated sampling can be combined with a fictitious Lagrangian technique, 
similar to that developed by |18£| in a way first proposed by ]190| for geometry 
optimization. In this approach, the expectation value of the Hamiltonian is 
treated as a functional of the nuclear positions and the correlation parameters. 



(H) 



E[{A},{n}] 



(135) 



With the previous functional, a fictitious Lagrangian can be constructed of 
the form, 



Y,lMjnf-E[{A},{n}] 
I ^ 



(136) 



where M/ are the nuclear masses and \Xa are the fictitious masses for the 
variational parameters, Aq. The modified Euler-Lagrange equations can be 
used for generating dynamics for the sets of parameters, {7?,} and {A}, 
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Mjll'l 



dE 



(137) 
(138) 



A dissipative transformation of equations 137 and 138, where the masses M/ 
and jjLa are replaced by damped masses Mj and Jia can be used for geometry 
optimization. A more elaborate approach that attempts to include quantum 
effects in the dynamics is described in |191|. 

To conclude, a method that is not described here due to reasons of space 
and that needs to be explored further, is the generalized re-weighting method 
[|92|. 
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Part V 

List of symbols 

We denote: 

Zj Atomic nuclear charge, 

TZ Set of coordinates of clamped particles (under the Born-Oppenheimer ap- 
proximation): TZ = {TZi, . . . ,TZrn}, 

Ti Electronic coordinate in a Cartesian frame r, = {xi...Xd}, where the number 
of dimensions, is 3 for most chemical applications, 

R Set of coordinates of all n particles treated by QMC, R = {ri, . . . , r„}, 

(Tj Spin coordinate for an electron, a-\ is spin-up and a J, for spin-down particles, 

S Set of spin coordinates for particles, S = {ai, . . . , crjv}, 

Exact ground state wave function, 

'^i ith. exact wave function, 

'^T.i Approximate trial wave function for state i, 

(pk Single particle molecular orbital (MO), 

D{(j)k) Slater determinant of k MOs: D = -^det . . . ,0jv|, 

, {(f)\.) , Spin factored Slater determinants for spin up (t) and spin 
down (J.) electrons. 
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H Hamiltonian operator: H = T + V , 

T Kinetic energy operator — = — | Vf , 

V Potential energy operator, for atomic and molecular systems: V = — ■ ^ + 



T Imaginary time, t = it, 

p{r) Electronic density. 

El Local Energy, i?1'T(R)/*T(R), 

U[a,b] Uniform random variate in the interval [a,b], 

Qa Gaussian random variate of variance a, 

(Tq Monte Carlo variance for observable O 

W Ensemble of random walkers, W = {Ri, R2, . . . , Rn}, 

P(R) Monte Carlo probability density function, 

P(R — > R') Monte Carlo transition probability, 

i?(R, R') Branching factor for Population Monte Carlo algorithms. 

G(R, R'; T — r') Time dependent Green's function, 

(j'st(Rj R'; St) Time dependent short time Green's function for the Schrodinger 
equation 

Fq Quantum force, Fq = Vln|*T(R)^|, 

V Denominator in Fermion Monte Carlo, V = E^-'^'It^Stt^T^] 

Ml Ionic mass, 

fXce Fictitious parameter mass. 
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